June 28, 2018

Recommended Summer Reading (Non-Technical)

Overview

  • Methods for estimating causal effects: how to answer the question of "What is the effect of A on B?"
  • Randomized designs
  • Alternative designs when randomization is infeasible: matching methods, propensity scores, regression discontinuity, and instrumental variables. When are they most feasible?
  • Broad spectrum of applications: health sciences, genetic studies, mental health, econometrics, public policy, education, social sciences … (That's why causal inference is very exciting)
  • But…we need to define causal questions first.

Question Set 1 (causal questions)

  • Should a woman older than 50 get regular screening for breast cancer?
  • Do citizens of Los Angeles die because of air pollution?
  • What is the effect of heavy adolescent marijuana use on adult outcomes (e.g., earnings at age 40)?
  • How much mortality and other burden was due to the tobacco industry's misconduct?
  • Does the Head Start program improve educational and health outcomes for children?
  • Does a "healthy marriage" intervention improve relationship quality?
  • How does the type of school affect a child's achievements later in life?

Question Set 2 (we will not address such questions)

  • Are parents more conservative than their children because they are older?
  • Is there an effect of gender in this regression?

How do Question Set 1 and 2 differ?

(Hint: "effect of cause" or "cause of effect")

A causal question is a problem with a manipulable intervention.

Causal Inference

  • Important (and hot) topic right now
  • Comparative effectiveness
  • Debates regarding study design: "efficacy" versus "effectiveness"; observational versus randomized experiments
  • Analytic challenges in modern study designs:
    • Community-level interventions (highway billboards, vaccination, new program…); matched-pair cluster-randomized trials (Wu et al., 2014, Biometrics; it's me)
    • Facebook A/B testing: which ordering of ads or friends' posts makes you click?

Today's Objectives

  • Be able to formalize causal effect discussions
  • Understand key elements of causal inference
  • Resolve Lord's paradox

What do we mean by a causal effect?

  • What is the effect of some "treatment" \(T\) on an outcome \(Y\)?
    • Effect of a cause rather than cause of an effect
    • \(T\) must be a particular "intervention": something we can imagine giving or withholding
    • e.g. smoking a pack a day on lung cancer, Good Behavior Game on children's behavior and academic achievement

Key Elements in Rubin's Causal Model (Rubin, 1974, Journal of Educational Psychology)

  • Units, at a particular place and time
  • Treatments/interventions to compare (e.g., \(T=0\) for standard, \(T=1\) for new treatment)
  • Potential outcomes, e.g., \(Y_i(1), Y_i(0)\) are the outcomes that would be observed on the same subjects if assigned new, or alternatively, if assigned standard treatment

  • Causal Effect (definition): comparisons of potential outcomes for the same subject or same groups of subjects.

These elements help us be very clear about the effects we are estimating. It helps to create a data table with observed and unobserved potential outcomes.

Units

  • The entity to which we apply or withhold the treatment
  • e.g., individuals, schools, communities
  • At a particular point in time
    • Me today and me tomorrow are two different units
  • Example: adolescents, elderly people

Treatment

  • The "intervention" that we could apply or withhold
    • Not "being male" or "being black"
    • Think of specific intervention that could happen
    • Example: Body mass index (BMI); heavy drug use during adolescence; trained nurses in a clinic to help manage the care for elderly people
  • Defined in reference to some control condition of interest (!)
    • Defining control could be more difficult than the treatment
    • No treatment? Existing standard treatment?
    • Example: no or light drug use?

Potential Outcomes

  • The potential outcomes that could be observed for each unit
    • \(Y(T=1)=Y(1)\): the outcome that could be observed if a unit gets the treatment
    • \(Y(T=0)=Y(0)\): the outcome that could be observed if a unit gets the control
  • For example, your headache pain in two hours if you take an aspirin; your headache pain in two hours if you do not take the aspirin
  • Example: earnings if a heavy drug user (\(Y_i(1)\)); earnings if not (\(Y_i(0)\))
  • Causal effects are comparisons of these potential outcomes
  • No causal inference is possible when the existence of both \(Y_i(1)\) and \(Y_i(0)\) makes no sense.
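The bookkeeping behind potential outcomes can be sketched in a few lines of Python. This is a minimal illustration, not data from any study: the pain scores, the treatment vector, and the helper name `mask_potential_outcomes` are all hypothetical. It shows that each unit reveals exactly one of its two potential outcomes, depending on the treatment it actually received.

```python
import numpy as np

def mask_potential_outcomes(y0, y1, t):
    """Split the full table of potential outcomes into observed and missing parts."""
    y0, y1, t = np.asarray(y0, float), np.asarray(y1, float), np.asarray(t)
    y_obs = np.where(t == 1, y1, y0)    # outcome under the treatment actually received
    y_miss = np.where(t == 1, y0, y1)   # counterfactual outcome, never observed
    return y_obs, y_miss

# Hypothetical headache-pain scores (0-10) two hours later for five units.
y0 = [7, 5, 8, 6, 4]   # pain without aspirin, Y_i(0)
y1 = [3, 5, 6, 2, 4]   # pain with aspirin, Y_i(1)
t = [1, 0, 0, 1, 1]    # treatment actually received, T_i

y_obs, y_miss = mask_potential_outcomes(y0, y1, t)

# Print the table as an analyst would see it: "?" marks the missing outcome.
print("unit  T  Y(0)  Y(1)")
for i, ti in enumerate(t):
    shown_y0 = f"{y0[i]}" if ti == 0 else "?"
    shown_y1 = f"{y1[i]}" if ti == 1 else "?"
    print(f"{i + 1:>4}  {ti}  {shown_y0:>4}  {shown_y1:>4}")
```

In the simulated "true" table both columns are known, so unit-level effects \(Y_i(1)-Y_i(0)\) could be computed directly; in the printed table every row has one "?", which is why causal effects have to be estimated rather than read off.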

Setting

True "Data"

Observed Data

Two Types of Causal Effects

Suppose we randomize subjects to \(T=1\) (new intervention) versus \(0\) (control)

  • Estimate causal effect:
    • \(ATE = \frac{1}{N}\sum_{i=1}^N \left\{Y_i(1)-Y_i(0)\right\}\)
    • That is: \(\left(\sum_{i: T_i=1}Y_i(1) - {\color{red}\sum_{i:T_i=1}Y_i(0)}+{\color{red}\sum_{i: T_i=0}Y_i(1)} - \sum_{i:T_i=0}Y_i(0)\right)/N\)
  • Problem: cannot observe both \(Y_i(1)\) and \(Y_i(0)\)
  • Your goal is to estimate
    • \({\color{red}\sum_{i:T_i=1}Y_i(0)}\) and
    • \({\color{red}\sum_{i: T_i=0}Y_i(1)}\)
  • What does randomization ensure?
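One way to explore what randomization ensures is a small simulation. The sketch below is an illustration under assumed values: the outcome distributions, the sample size, and the function names `true_ate` and `diff_in_means` are all hypothetical. It fixes a table of potential outcomes, repeatedly re-randomizes half the units to \(T=1\), and compares the average of the simple difference in observed group means with the true ATE.

```python
import numpy as np

def true_ate(y1, y0):
    """Average treatment effect, computable only because both columns are simulated."""
    return float(np.mean(y1 - y0))

def diff_in_means(y_obs, t):
    """Observed mean among treated minus observed mean among controls."""
    return float(y_obs[t == 1].mean() - y_obs[t == 0].mean())

rng = np.random.default_rng(1)
n = 1000
y0 = rng.normal(10, 2, n)             # hypothetical outcomes under control
y1 = y0 + rng.normal(1.0, 1.0, n)     # hypothetical, heterogeneous unit-level effects

estimates = []
for _ in range(1000):
    # Complete randomization: exactly half of the units receive T = 1.
    t = np.zeros(n, dtype=int)
    t[rng.choice(n, n // 2, replace=False)] = 1
    y_obs = np.where(t == 1, y1, y0)  # only one potential outcome is revealed
    estimates.append(diff_in_means(y_obs, t))

print("true ATE:              ", round(true_ate(y1, y0), 3))
print("mean of the estimates: ", round(float(np.mean(estimates)), 3))
```

Across re-randomizations the estimates center on the true ATE, because under randomization the treated group's observed mean stands in for the missing \(Y_i(1)\) of the controls, and vice versa. Any single randomization still carries sampling error.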

Statistical Concepts for Learning about Causal Effects

  • Replication
  • The Stable Unit Treatment Value Assumption (SUTVA)
  • The assignment mechanism

Replication

  • Need to have multiple units, some getting treatment and some getting control
  • The number of potential outcomes grows

Stable Unit Treatment Value Assumption (SUTVA)

  • No interference between units: treatment assignment of one unit does not affect potential outcomes of another unit.
    • Agricultural experiments (guard rows)
    • Drug use of one individual does not affect earnings of other individuals
  • Only one version of each treatment
    • Lumping all "heavy" drug use together; estimating effect of any "heavy drug use"

Possible SUTVA Violation

Assignment Mechanism

  • Process that determines which treatment each unit receives
  • Randomized experiments: Known (nice!) assignment mechanism
  • Observational studies: have to posit an assignment mechanism (determine/model why some individuals become heavy drug users)
    • Propensity score models the assignment mechanism
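To illustrate why the assignment mechanism matters in observational data, here is a hedged sketch using simulated data. Everything in it is an assumption made for illustration: a single covariate `x`, a logistic assignment probability, a constant treatment effect of 1, and inverse-probability weighting with normalized weights as one possible way to use the propensity score. The propensity score is known here only because the data are simulated; in a real observational study it would have to be modeled and estimated.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

x = rng.normal(size=n)              # hypothetical covariate (a confounder)
e = 1.0 / (1.0 + np.exp(-x))        # assumed true propensity score P(T=1 | x)
t = rng.binomial(1, e)              # assignment depends on x, not a coin flip

y0 = 2.0 * x + rng.normal(size=n)   # hypothetical outcome under control
y1 = y0 + 1.0                       # assumed constant treatment effect of 1
y_obs = np.where(t == 1, y1, y0)

# Naive comparison ignores the assignment mechanism and is confounded by x.
naive = y_obs[t == 1].mean() - y_obs[t == 0].mean()

# Inverse-probability weighting (normalized weights) using the propensity score.
w1 = t / e
w0 = (1 - t) / (1 - e)
ipw = np.sum(w1 * y_obs) / np.sum(w1) - np.sum(w0 * y_obs) / np.sum(w0)

print("naive difference in means:", round(float(naive), 2))
print("IPW estimate:             ", round(float(ipw), 2))
```

The naive difference overstates the effect because units with larger `x` are both more likely to be treated and have higher outcomes. Weighting by the propensity score removes that imbalance in this simulation, provided the posited assignment mechanism is correct.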

Assignment Mechanism

  • High-level: The assignment mechanism is the rule (possibly probabilistic) by which subjects get their actual treatments \(\{T_i\}\). The assigned treatments unmask the potential outcomes \(Y_i(T_i)\) (denoted by \(Y_i^{obs}\)), but mask the remaining potential outcomes, denoted by \(Y_i^{miss}\).
  • Central to causal inference (look for it when reading the literature; many papers do not mention it)
  • An extreme example: a doctor who always gives her patients the best treatment (no randomness given patient information and the doctor)
    • but over a population of doctors, the assignments can be summarized probabilistically.

Lord's Paradox

Two Contradictory Statisticians

Who is right?

Consider the framework:

  • Units:
  • Covariates:
  • Potential outcomes:
  • Treatment:
  • Control:

Well, it depends

Consider the framework:

  • Units: students
  • Covariates: Sex, September weight
  • Potential outcomes: June weight under treatment and control
  • Treatment: University diet
  • Control: ???
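The disagreement can be reproduced with simulated data. The sketch below assumes the usual statement of Lord's paradox: one statistician compares average weight gains between the sexes, the other compares June weights after adjusting for September weight (analysis of covariance). All numbers (group means, standard deviation, correlation, sample size) are hypothetical and chosen only so that neither group's average weight changes between September and June.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000                             # students per group (hypothetical)
rho, sd = 0.7, 8.0                     # hypothetical within-group correlation and SD (kg)
means = {"women": 60.0, "men": 75.0}   # hypothetical mean weights (kg)

sept, june, male = [], [], []
for is_male, mu in enumerate(means.values()):
    s = rng.normal(mu, sd, n)
    # June weight has the same group mean and SD as September weight.
    j = mu + rho * (s - mu) + rng.normal(0, sd * np.sqrt(1 - rho**2), n)
    sept.append(s)
    june.append(j)
    male.append(np.full(n, is_male))
sept, june, male = map(np.concatenate, (sept, june, male))

# Statistician 1: compare mean gain (June minus September) between the sexes.
gain = june - sept
gain_diff = gain[male == 1].mean() - gain[male == 0].mean()

# Statistician 2: regress June weight on September weight and sex (ANCOVA).
X = np.column_stack([np.ones_like(sept), sept, male])
coef, *_ = np.linalg.lstsq(X, june, rcond=None)
ancova_diff = coef[2]

print("difference in mean gain (men - women):", round(float(gain_diff), 2))
print("sex coefficient adjusting for Sept:   ", round(float(ancova_diff), 2))
```

Both numbers are correct descriptive summaries of the same data, yet one is near zero and the other is clearly positive. Neither is a causal effect of the diet until the control condition, and hence \(Y_i(0)\), is specified, which is the "???" in the framework above.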

Lord's observed data

Two control conditions

Recommended Reading

Common Objectives in Causal Inference Training/Research

  • Understand causal problems as potential interventions and how they are different from association questions
  • Understand the framework used to discuss and formalize causal inference (potential outcomes, assignment mechanism); graphical approaches (Pearl) are not discussed here but are hugely important for their intuitive appeal and robustness to probability specifications ("The Book of Why")
  • Understand designs (ways to efficiently collect useful data) and methods for analyzing these data to answer causal questions
  • Understand complications in causal studies, including missing data, noncompliance and many hidden biases.
  • Use these as powerful tools to critically review research with causal claims. And improve science!

Main Points Once Again

  • Causal inference is counterfactual
  • Causal effect is a function of \((Y(1), Y(0))\)
  • Causal inference requires estimating unobserved responses; it makes sense only when that estimation does
  • Causal inference requires an assignment mechanism
  • Assignment mechanism known in randomized studies; must be assumed or modeled in observational studies (propensity scores)
  • Causal inference makes assumptions, e.g., non-interference, etc.